Mobile Legends: Bang Bang is a MOBA (multiplayer online battle arena) game for Android and iOS mobile devices, developed by Shanghai Moonton. The game was originally released in Asia on 11 June 2016.
In the game there are 2 opposing teams of 5 players each. Players choose the character they will play before the game starts. As of now there are around 90 champions (characters) to choose from. Each character is different and may be used for different purposes depending on their skills and abilities. In that way one can distinguish mages, assassins, fighters, supports, tanks and marksmen. The main task is to destroy the enemy’s defence towers and ultimately conquer their base (yup, it’s very boring but somehow very popular).
The game has been getting more and more attention in Poland for a couple of years now. The graph below presents interest over time for the Google queries “Mobile Legends” and “MOBA” in Poland. As you can see, around 2017 there was a huge increase in the popularity of Mobile Legends, while interest in MOBA games in general had been falling gradually over the past 5 years. However, in March and April 2020 both experienced a rapid renaissance. We can probably associate it, at least in part, with the lockdown caused by the COVID-19 outbreak.
As stated above, there are several types of characters in the game, so we will try to label them based on their characteristics. In order to do so we are going to use Principal Component Analysis to reduce dimensionality and then a hierarchical algorithm to cluster the characters. Although the labels are known, such an analysis may be helpful for keeping characters’ skillsets balanced.
First we have to collect the data. As there is no official site with data on champions’ characteristics, we will scrape it from the Mobile Legends wiki. Let’s check the robots.txt file before we start.
library(robotstxt)  # provides paths_allowed()
paths_allowed("https://mobile-legends.fandom.com/wiki/Mobile_Legends_Wiki")
The command above returns TRUE. That’s nice - we are allowed to scrape their data. For that purpose we will combine the rvest package with the SelectorGadget widget. The whole scraping/wrangling code is provided in a separate Rmd file in the GitHub repository.
Let’s take a look at our data. In the table below you can find all characters in alphabetical order.
One important remark: although the list below presents all characters playable right now, we will treat it as a sample, since the character set is constantly being extended with new characters - in that way statistical inference can be justified.
| Id | Hero | Movement speed | Magic Resistance | Mana | HP Regen Rate | Physical Attack | Armor | Health points | Attack speed | Mana regen rate | Role |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Akai | 260 | 10 | 422 | 42 | 115 | 24 | 2769 | 0.8500 | 12 | Tank |
| 2 | Aldous | 260 | 10 | 405 | 45 | 129 | 22 | 2718 | 0.8360 | 18 | Fighter |
| 3 | Alice | 240 | 10 | 493 | 36 | 114 | 21 | 2573 | 0.8000 | 18 | Mage |
| 4 | Alpha | 260 | 10 | 453 | 39 | 121 | 20 | 2646 | 0.9160 | 16 | Fighter |
| 5 | Alucard | 260 | 10 | 0 | 39 | 123 | 21 | 2821 | 0.9000 | 0 | Fighter |
| 6 | Angela | 240 | 10 | 515 | 34 | 115 | 15 | 2421 | 0.7920 | 18 | Support |
| 7 | Argus | 260 | 10 | 0 | 40 | 124 | 21 | 2628 | 0.9160 | 0 | Fighter |
| 8 | Atlas | 240 | 10 | 440 | 42 | 135 | 0 | 2819 | 0.7860 | 15 | Tank |
| 9 | Aurora | 245 | 10 | 500 | 34 | 105 | 17 | 2441 | 0.8000 | 23 | Mage |
| 10 | Badang | 255 | 10 | 0 | 40 | 119 | 23 | 2708 | 0.9080 | 0 | Fighter |
| 11 | Balmond | 260 | 10 | 0 | 47 | 119 | 25 | 2836 | 0.8500 | 0 | Fighter |
| 12 | Bane | 260 | 10 | 433 | 42 | 117 | 23 | 2659 | 0.8500 | 12 | Fighter |
| 13 | Belerick | 250 | 10 | 450 | 62 | 110 | 20 | 3109 | 0.8100 | 12 | Tank |
| 14 | Bruno | 240 | 10 | 439 | 30 | 128 | 17 | 2522 | 0.8500 | 15 | Marksman |
| 15 | Carmilla | 197 | 13 | 477 | 45 | 118 | 10 | 2378 | NA | 34 | Support |
| 16 | Cecilion | 265 | 15 | 574 | 32 | 165 | 23 | 2425 | NA | 26 | Mage |
| 17 | Chang'e | 240 | 10 | 505 | 34 | 115 | 16 | 2301 | 0.8080 | 21 | Mage |
| 18 | Chou | 260 | 10 | 0 | 39 | 121 | 23 | 2708 | 0.8840 | 0 | Fighter |
| 19 | Claude | 240 | 10 | 450 | 40 | 137 | 14 | 2370 | 0.8260 | 15 | Marksman |
| 20 | Clint | 240 | 10 | 450 | 36 | 115 | 20 | 2530 | 0.8420 | 15 | Marksman |
| 21 | Cyclops | 240 | 10 | 500 | 38 | 112 | 18 | 2521 | 0.8000 | 20 | Mage |
| 22 | Diggie | 250 | 10 | 490 | 36 | 115 | 18 | 2351 | 0.8000 | 20 | Support |
| 23 | Dyrroth | 266 | 10 | 0 | 41 | 117 | 19 | 2758 | 0.9160 | 0 | Fighter |
| 24 | Esmeralda | 240 | 10 | 502 | 36 | 114 | 21 | 2573 | 0.8000 | 20 | Mage |
| 25 | Estes | 240 | 10 | 545 | 36 | 115 | 13 | 2161 | 0.8000 | 18 | Support |
| 26 | Eudora | 250 | 10 | 468 | 38 | 112 | 19 | 2524 | 0.8000 | 16 | Mage |
| 27 | Fanny | 265 | 10 | 0 | 33 | 126 | 17 | 2526 | 0.8940 | 0 | Assassin |
| 28 | Faramis | 260 | 10 | 0 | 39 | 222 | 36 | 3700 | 0.9400 | 19 | Support |
| 29 | Franco | 260 | 10 | 440 | 46 | 116 | 25 | 2709 | 0.8260 | 10 | Tank |
| 30 | Freya | 260 | 10 | 462 | 49 | 109 | 22 | 2801 | 0.8760 | 14 | Fighter |
| 31 | Gatotkaca | 260 | 10 | 440 | 42 | 120 | 20 | 2709 | 0.8180 | 12 | Tank |
| 32 | Gord | 240 | 10 | 570 | 32 | 110 | 13 | 2478 | 0.7720 | 25 | Mage |
| 33 | Granger | 240 | 10 | 0 | 27 | 125 | 15 | 2490 | 0.8180 | 0 | Marksman |
| 34 | Grock | 260 | 10 | 430 | 42 | 135 | 21 | 2819 | 0.8100 | 42 | Tank |
| 35 | Guinevere | 260 | 10 | 0 | 39 | 126 | 18 | 2528 | 0.9160 | 0 | Fighter |
| 36 | Gusion | 260 | 10 | 469 | 39 | 119 | 18 | 2578 | 0.8920 | 16 | Assassin |
| 37 | Hanabi | 245 | 10 | 390 | 30 | 115 | 17 | 2510 | 0.8500 | 15 | Marksman |
| 38 | Hanzo | 260 | 10 | 0 | 35 | 118 | 17 | 2594 | 0.8700 | 0 | Assassin |
| 39 | Harith | 240 | 10 | 490 | 36 | 114 | 19 | 2701 | 0.8400 | 18 | Mage |
| 40 | Harley | 240 | 10 | 490 | 36 | 114 | 19 | 2501 | 0.8480 | 18 | Mage |
| 41 | Hayabusa | 260 | 10 | 0 | 37 | 117 | 17 | 2629 | 0.8540 | 0 | Assassin |
| 42 | Helcurt | 255 | 10 | 440 | 35 | 121 | 17 | 2559 | 0.8700 | 16 | Assassin |
| 43 | Hilda | 260 | 10 | 0 | 42 | 123 | 24 | 2709 | 0.8420 | 0 | Fighter |
| 44 | Hylos | 260 | 10 | 430 | 42 | 105 | 17 | 3309 | 0.8360 | 12 | Tank |
| 45 | Irithel | 260 | 10 | 438 | 35 | 110 | 17 | 2540 | 0.8260 | 15 | Marksman |
| 46 | Jawhead | 255 | 10 | 430 | 39 | 119 | 24 | 2778 | 0.9000 | 16 | Fighter |
| 47 | Johnson | 255 | 10 | 0 | 42 | 112 | 27 | 2809 | 0.8260 | 12 | Tank |
| 48 | Kadita | 240 | 10 | 495 | 34 | 105 | 18 | 2491 | 0.8000 | 18 | Mage |
| 49 | Kagura | 240 | 10 | 519 | 35 | 118 | 19 | 2556 | 0.8160 | 21 | Mage |
| 50 | Kaja | 270 | 10 | 400 | 52 | 120 | 30 | 2609 | 0.8420 | 12 | Fighter |
| 51 | Karina | 260 | 10 | 431 | 39 | 121 | 20 | 2633 | 0.9000 | 16 | Assassin |
| 52 | Karrie | 240 | 10 | 440 | 40 | 112 | 17 | 2498 | 0.8396 | 15 | Marksman |
| 53 | Khufra | 255 | 0 | 460 | 47 | 117 | 19 | 2709 | 0.7860 | 15 | Tank |
| 54 | Kimmy | 245 | 10 | 100 | 40 | 104 | 22 | 2450 | 0.8260 | 0 | Marksman |
| 55 | Lancelot | 260 | 10 | 450 | 35 | 124 | 16 | 2549 | 0.8700 | 16 | Assassin |
| 56 | Lapu-Lapu | 260 | 10 | 0 | 35 | 119 | 21 | 2628 | 0.9000 | 16 | Fighter |
| 57 | Layla | 240 | 10 | 424 | 27 | 130 | 15 | 2500 | 0.8500 | 14 | Marksman |
| 58 | Leomord | 240 | 10 | 0 | 35 | 128 | 25 | 2738 | 0.8440 | 0 | Fighter |
| 59 | Lesley | 240 | 10 | 0 | 36 | 115 | 14 | 2490 | 0.8260 | 0 | Marksman |
| 60 | Ling | 260 | 10 | 0 | 39 | 119 | 18 | 2578 | 0.8920 | 0 | Assassin |
| 61 | Lolita | 260 | 10 | 480 | 48 | 115 | 27 | 2679 | 0.7860 | 12 | Tank |
| 62 | Lunox | 240 | 10 | 540 | 34 | 115 | 15 | 2521 | 0.8080 | 23 | Mage |
| 63 | Lylia | 245 | 10 | 500 | 34 | 113 | 17 | 2501 | 0.8080 | 19 | Mage |
| 64 | Martis | 260 | 10 | 405 | 35 | 128 | 25 | 2738 | 0.8680 | 16 | Fighter |
| 65 | Masha | 312 | 10 | 101 | 19 | NA | 12 | 1948 | NA | 0 | Fighter |
| 66 | Minotaur | 260 | 10 | 0 | 44 | 123 | 23 | 2759 | 0.7300 | 0 | Tank |
| 67 | Minsitthar | 260 | 10 | 380 | 37 | 121 | 23 | 2698 | 0.8520 | 16 | Fighter |
| 68 | Miya | 240 | 10 | 445 | 30 | 129 | 17 | 2524 | 0.8500 | 15 | Marksman |
| 69 | Moskov | 240 | 10 | 420 | 32 | 125 | 16 | 2530 | 0.8140 | 15 | Marksman |
| 70 | Nana | 250 | 10 | 510 | 34 | 115 | 17 | 2501 | 0.8640 | 18 | Mage |
| 71 | Natalia | 260 | 10 | 486 | 35 | 121 | 18 | 2589 | 0.9020 | 16 | Assassin |
| 72 | Odette | 240 | 10 | 495 | 34 | 105 | 18 | 2491 | 0.8000 | 23 | Mage |
| 73 | Pharsa | 240 | 10 | 490 | 34 | 109 | 15 | 2421 | 0.7900 | 18 | Mage |
| 74 | Rafaela | 245 | 10 | 545 | 36 | 117 | 15 | 2441 | 0.7920 | 23 | Support |
| 75 | Roger | 240 | 10 | 450 | 36 | 128 | 22 | 2730 | 0.8420 | 15 | Fighter |
| 76 | Ruby | 260 | 10 | 430 | 30 | 114 | 23 | 2859 | 0.8580 | 14 | Fighter |
| 77 | Saber | 260 | 10 | 443 | 35 | 118 | 17 | 2599 | 0.8700 | 16 | Assassin |
| 78 | Selena | 240 | 10 | 490 | 34 | 110 | 15 | 2401 | 0.8040 | 18 | Assassin |
| 79 | Silvanna | 255 | 10 | 430 | 39 | 126 | 22 | 2828 | 0.9160 | 16 | Fighter |
| 80 | Sun | 260 | 10 | 400 | 41 | 114 | 23 | 2758 | 0.9160 | 16 | Fighter |
| 81 | Terizla | 255 | 10 | 0 | 54 | 129 | 19 | 2728 | 0.8200 | 0 | Fighter |
| 82 | Thamuz | 255 | 10 | 0 | 39 | 123 | 24 | 2758 | 0.8600 | 0 | Fighter |
| 83 | Tigreal | 260 | 10 | 450 | 42 | 112 | 25 | 2890 | 0.8260 | 12 | Tank |
| 84 | Uranus | 260 | 10 | 455 | 32 | 115 | 20 | 2689 | 0.8340 | 12 | Tank |
| 85 | Vale | 250 | 10 | 490 | 34 | 115 | 15 | 2401 | 0.8000 | 21 | Mage |
| 86 | Valir | 245 | 10 | 495 | 34 | 105 | 18 | 2516 | 0.8000 | 18 | Mage |
| 87 | Vexana | 245 | 10 | 490 | 38 | 112 | 17 | 2421 | 0.8000 | 20 | Mage |
| 88 | Wanwan | 240 | 0 | 424 | 27 | 100 | 0 | 2540 | 0.8260 | 14 | Marksman |
| 89 | X.Borg | 260 | 10 | 0 | 39 | 117 | 25 | 1138 | 0.8680 | 0 | Fighter |
| 90 | Yi Sun-Shin | 240 | 10 | 438 | 36 | 110 | 18 | 2520 | 0.8000 | 15 | Marksman |
| 91 | Zhask | 240 | 10 | 490 | 34 | 107 | 15 | 2401 | 0.8000 | 20 | Mage |
| 92 | Zilong | 265 | 10 | 405 | 35 | 123 | 25 | 2689 | 0.9640 | 16 | Fighter |
One important thing we should be interested in is the variability of champions’ characteristics. Below you can see the coefficient of variation (in %).
| Movement speed | Magic Resistance | Mana | HP Regen Rate | Physical Attack | Armor | Health points | Attack speed | Mana regen rate |
|---|---|---|---|---|---|---|---|---|
| 5.04 | 16.19 | 59.37 | 15.99 | 11.76 | 26.35 | 10.18 | 5.18 | 63.65 |
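As a rough illustration of how such a table can be computed, here is a minimal sketch in base R. The data frame `heroes` and the helper `cv_percent` are names made up for this example, and only the first five rows of the hero table are used, so the numbers will differ from the full-sample values above.

```r
# Illustrative subset: the first five rows of the hero table (three variables only).
heroes <- data.frame(
  hero   = c("Akai", "Aldous", "Alice", "Alpha", "Alucard"),
  mv_spd = c(260, 260, 240, 260, 260),
  mana   = c(422, 405, 493, 453, 0),
  hp     = c(2769, 2718, 2573, 2646, 2821),
  stringsAsFactors = FALSE
)

# Coefficient of variation in percent: standard deviation divided by the mean.
# Missing values are ignored, since the scraped table contains some NAs.
cv_percent <- function(x) {
  100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
}

# Apply the helper to every numeric column (everything except the name).
cv <- sapply(heroes[, -1], cv_percent)
round(cv, 2)
```

Because the coefficient of variation is unitless, it lets us compare the spread of variables measured on very different scales, such as movement speed and health points.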
The variability of mana and mana regeneration is around 60% (59% and 64% respectively), whereas for health point regeneration and armor it’s about 16% and 26% respectively. We will have to check the data for outliers, as those values might be inflated, for instance, by just a single observation. The rest of the variables vary only a bit (roughly 5% to 12%). The value of magic resistance is constant for almost every character, so we will drop that variable in further analysis.
Now let’s look for some possible relationships and check distributions of the variables.
We can see some relationships - for example mana vs. mana regeneration, health points vs. armor and many more - we will investigate them soon.
The density functions for movement speed, mana and mana regeneration seem to be bimodal - it is a clear sign that there are some subpopulations in our “sample”, so it is reasonable to conduct cluster analysis.
There are some outliers - note a champion whose health point regeneration is about 2 times higher than the sample mean. We can also see a champion whose health points and attack points are extremely high. For the sake of the analysis we will remove both of them from our “sample” so that they do not affect the clustering results in a significant way. Let’s find out who they are.
| Id | Hero | Movement speed | Magic Resistance | Mana | HP Regen Rate | Physical Attack | Armor | Health points | Attack speed | Mana regen rate | Role |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 13 | Belerick | 250 | 10 | 450 | 62 | 110 | 20 | 3109 | 0.81 | 12 | Tank |
| 28 | Faramis | 260 | 10 | 0 | 39 | 222 | 36 | 3700 | 0.94 | 19 | Support |
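The two champions above were picked out by looking at the plots. One possible way to flag such observations programmatically is the 1.5 × IQR rule; the sketch below is an assumption about how this could be done, not the procedure used in this analysis. The names `heroes` and `is_outlier` are hypothetical, and only ten rows of the hero table are included.

```r
# Illustrative subset of the hero table: ten champions, three variables.
heroes <- data.frame(
  hero   = c("Akai", "Aldous", "Alice", "Alpha", "Angela",
             "Belerick", "Bruno", "Faramis", "Franco", "Freya"),
  hp_rgn = c(42, 45, 36, 39, 34, 62, 30, 39, 46, 49),
  p_atk  = c(115, 129, 114, 121, 115, 110, 128, 222, 116, 109),
  hp     = c(2769, 2718, 2573, 2646, 2421, 3109, 2522, 3700, 2709, 2801),
  stringsAsFactors = FALSE
)

# TRUE for values lying more than k * IQR below Q1 or above Q3.
is_outlier <- function(x, k = 1.5) {
  q <- unname(quantile(x, c(0.25, 0.75), na.rm = TRUE))
  iqr <- q[2] - q[1]
  x < q[1] - k * iqr | x > q[2] + k * iqr
}

# One logical column per variable; a hero is flagged if any variable is extreme.
flags <- sapply(heroes[, c("hp_rgn", "p_atk", "hp")], is_outlier)
flagged <- heroes$hero[apply(flags, 1, any)]
flagged
```

On this subset the rule flags the same two champions as the visual inspection, but the result depends on the rows included and on the multiplier `k`.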
The last thing we can do is check the correlations and their significance - just to get a general view, since Simpson’s paradox might be present.
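For a single pair of variables, a correlation and its significance can be checked with `cor.test` from base R. The sketch below uses mana and mana regeneration for the first ten champions in the hero table only, so it is an illustration and not the full correlation matrix.

```r
# Mana and mana regeneration for the first ten champions in the table.
mana     <- c(422, 405, 493, 453, 0, 515, 0, 440, 500, 0)
mana_rgn <- c(12, 18, 18, 16, 0, 18, 0, 15, 23, 0)

# Pearson correlation together with a test of H0: correlation = 0.
ct <- cor.test(mana, mana_rgn, method = "pearson")

r <- unname(ct$estimate)  # correlation coefficient
p <- ct$p.value           # p-value of the test
c(r = r, p = p)
```

Keep in mind that a pooled correlation like this one mixes champions with and without mana, which is exactly the kind of grouping that can hide or reverse a relationship within subgroups.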
Dealing with high-dimensional data might be challenging and can lead to several problems. However, in most cases it is possible to reduce the number of dimensions while retaining most of the information stored in the data. One of the most widely used methods that allows us to do so is Principal Component Analysis. What we basically want to do is project our data matrix onto some reduced feature space using a linear transformation while retaining as much information as possible. And that is exactly what PCA does!
Let’s assume we have a data matrix \(X\) (with centered columns) consisting of \(m\) observations in rows and \(n\) variables in columns, so \(X \in \mathbb{R}^{m \times n}\). We want to find a linear transformation \(U\) that transforms \(X\) as follows: \[Z = XU, \text{ where } Z \in \mathbb{R}^{m \times d}, U \in \mathbb{R}^{n \times d} \text{ and } d<n.\] At the same time we want to make sure we minimize the information loss. We can think of the variance-covariance matrix as a representation of the information in our data. For the original data it is \(\Sigma = \frac{1}{m}X^TX\) with \(\Sigma \in \mathbb{R}^{n\times n}\), and for the transformed data it can be denoted as \[\Sigma_Z = \frac{1}{m}Z^TZ, \text{ where } \Sigma_Z \in \mathbb{R}^{d\times d}.\] Keeping that in mind, the search for our transformation becomes the following optimisation problem, in which we maximize the total variance (the trace) of the transformed data: \[\max_{U}\operatorname{tr}(\Sigma_Z)=\max_U\frac{1}{m}\operatorname{tr}\left((XU)^T(XU)\right) = \max_U\frac{1}{m}\operatorname{tr}\left(U^TX^TXU\right)=\max_U\operatorname{tr}\left(U^T\Sigma U\right), \text{ where } U^TU = I.\]Note that we have to add the normalization condition to make sure all of the vectors have unit magnitude, because otherwise we would not be able to solve this problem as there is no upper bound. One possible way to solve such problems is the method of Lagrange multipliers.
First we construct the Lagrange function. For a single column \(u\) of \(U\), with the constraint \(u^Tu = 1\), it reads: \[F(u,\lambda)=u^T\Sigma u + \lambda(1-u^Tu).\]
Then we differentiate it with respect to \(u\) and set the result to 0, as the derivative should equal 0 at an extremum: \[\frac{dF}{du}=2\Sigma u-2\lambda u = 0.\]
We can rewrite it as \[\Sigma u=\lambda u.\]
The latter is exactly the eigenvector equation, so what we do is diagonalize the variance-covariance matrix (eigendecomposition) to obtain the eigenvectors and corresponding eigenvalues (here with all \(n\) eigenvectors as columns): \[\Sigma = U \Lambda U^{-1}.\]
Then we can sort the pairs of eigenvectors and eigenvalues in descending order of eigenvalue and choose the top \(d\) pairs. In that way we come up with a set of \(d\) eigenvectors that retains the following share of the variance: \[\frac{\sum_{i=1}^{d} \lambda_i}{\sum_{i=1}^{n} \lambda_i}.\]
The transformation \(U\) that we are looking for is composed of the chosen eigenvectors \[U = [u_1, ..., u_d].\]
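The derivation above can be checked numerically. The following is a minimal sketch on simulated data (not the champions data), assuming standardized variables; it builds the projection from the eigendecomposition by hand and compares it with base R's `prcomp`. Note that `cov()` divides by \(m - 1\) rather than \(m\), which rescales the eigenvalues but does not change the eigenvectors or the retained-variance ratio.

```r
set.seed(1)  # simulated data, only so that the sketch is reproducible
m <- 50      # observations
n <- 4       # variables
X <- matrix(rnorm(m * n), nrow = m, ncol = n)
X[, 2] <- X[, 1] + 0.3 * X[, 2]  # make two variables correlated

Xs    <- scale(X)   # center and standardize the columns
Sigma <- cov(Xs)    # n x n variance-covariance matrix

# Eigendecomposition; eigen() returns eigenvalues in decreasing order.
eig <- eigen(Sigma, symmetric = TRUE)

d <- 2                                        # number of components to keep
U <- eig$vectors[, seq_len(d), drop = FALSE]  # n x d transformation
Z <- Xs %*% U                                 # m x d projected data

# Share of the total variance retained by the first d components.
retained <- sum(eig$values[seq_len(d)]) / sum(eig$values)

# Reference implementation for comparison (based on SVD internally).
pca <- prcomp(X, center = TRUE, scale. = TRUE)
retained
```

The eigenvalues should match `pca$sdev^2`, and the columns of `U` should match `pca$rotation` up to sign, since an eigenvector multiplied by -1 is still an eigenvector.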
First let’s determine the relevant principal components using standardized data. As a scree plot would not tell us much, we should probably choose the number of components based on the eigenvalue rule of thumb. Each of the top three components has an eigenvalue > 1, i.e. “contains more information than a single variable”.
| Component | Eigenvalue | Variance percent | Cumulative variance percent |
|---|---|---|---|
| PC1 | 3.19 | 39.84 | 39.84 |
| PC2 | 1.40 | 17.54 | 57.38 |
| PC3 | 1.02 | 12.73 | 70.11 |
| PC4 | 0.83 | 10.38 | 80.49 |
| PC5 | 0.74 | 9.21 | 89.70 |
| PC6 | 0.45 | 5.62 | 95.32 |
| PC7 | 0.25 | 3.13 | 98.45 |
| PC8 | 0.12 | 1.55 | 100.00 |
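The rule of thumb can be applied directly to the eigenvalues in the table. The sketch below retypes the rounded eigenvalues by hand, so the percentages it produces may differ from the table in the second decimal place; the variable names are made up for this example.

```r
# Rounded eigenvalues copied from the table.
eigenvalues <- c(PC1 = 3.19, PC2 = 1.40, PC3 = 1.02, PC4 = 0.83,
                 PC5 = 0.74, PC6 = 0.45, PC7 = 0.25, PC8 = 0.12)

# With standardized data the eigenvalues sum to the number of variables,
# so each eigenvalue divided by the total is that component's variance share.
var_percent <- 100 * eigenvalues / sum(eigenvalues)
cum_percent <- cumsum(var_percent)

# Eigenvalue rule of thumb: keep components with an eigenvalue above 1.
n_keep <- sum(eigenvalues > 1)

n_keep
round(cum_percent[n_keep], 1)
```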
As you can see in the table above, they account for about 70.1% of data variability. That is not as much as we expected, but it’s fine. We dropped 5 of the 8 dimensions and still managed to retain over 70% of the variability.
Let’s now have a look at the PCA loadings so we can think of some reasonable interpretations.
| Variable | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 |
|---|---|---|---|---|---|---|---|---|
| MV_SPD | -0.454 | 0.223 | -0.065 | 0.284 | -0.003 | 0.444 | -0.673 | 0.099 |
| MANA | 0.41 | 0.469 | -0.228 | 0.169 | 0.104 | 0.111 | 0.149 | 0.697 |
| HP_RGN | -0.327 | 0.429 | 0.357 | -0.311 | 0.267 | 0.432 | 0.464 | -0.112 |
| P_ATK | -0.234 | -0.136 | -0.616 | -0.491 | 0.529 | -0.096 | -0.086 | 0.101 |
| P_DFN | -0.362 | 0.329 | 0.203 | 0.317 | 0.315 | -0.72 | -0.002 | 0.049 |
| HP | -0.223 | 0.433 | -0.255 | -0.395 | -0.695 | -0.241 | -0.004 | -0.004 |
| ATK_SPD | -0.355 | -0.14 | -0.489 | 0.526 | -0.143 | 0.137 | 0.538 | -0.099 |
| MANA_RGN | 0.398 | 0.461 | -0.304 | 0.141 | 0.188 | 0.023 | -0.113 | -0.685 |
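ncomp <- 8  # number of components to rotate (all eight here)
# Raw loadings: eigenvectors scaled by the component standard deviations
rawLoadings <- pca$rotation[, 1:ncomp] %*% diag(pca$sdev[1:ncomp], ncomp, ncomp)
# Varimax rotation to make the loadings easier to interpret
rotatedLoadings <- varimax(rawLoadings)$loadings
# Pseudo-inverse of the rotated loadings, used to compute the rotated scores
invLoadings <- t(pracma::pinv(rotatedLoadings))
# Scores of the standardized data on the rotated components
scores <- scale(df_num) %*% invLoadings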
PC2 definitely has the most obvious interpretation. We can think of it as the durability of a character, because the loadings for health points and health point regeneration are large and influence the PC in the same direction.
The next quite reasonable interpretation would be physical strength for PC4, because of the extent of the influence of attack points and also because of the opposite sign of the armor variable.
Readiness to fight would be the label for PC3. Since the loadings for mana and mana regeneration are both strongly negative, we would expect all magical champions to have a very low value of PC3.
As we cannot come up with anything better for PC1, let’s call it playability for now; maybe later on we will be able to revisit that issue.
If we included PC5, we would probably call it defensive capability. For more precise labeling we should talk to some gamers with experience and knowledge of the game - maybe someday I will update this post with my recent discoveries :)
Now that we have reduced dimensionality, we can proceed to the most exciting part of our analysis - distinguishing clusters. To do so we will use a hierarchical algorithm. Let’s start by computing distances between observations. For that purpose we will use the Minkowski metric of the first and second order, i.e. the Manhattan and Euclidean distances respectively.
#hc <- hclust(d.e, method = "ward.D2")
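To make the distance and clustering steps concrete, here is a minimal sketch using base R's `dist`, `hclust` and `cutree`. The matrix `pc_scores` is simulated and only stands in for the retained principal component scores; the name `d.m` and the choice of two clusters are assumptions made for the illustration.

```r
set.seed(42)  # simulated stand-in for the principal component scores
pc_scores <- rbind(
  matrix(rnorm(20, mean = -2), ncol = 2),  # 10 points around (-2, -2)
  matrix(rnorm(20, mean =  2), ncol = 2)   # 10 points around ( 2,  2)
)

# Minkowski distances of order 1 (Manhattan) and order 2 (Euclidean).
d.m <- dist(pc_scores, method = "manhattan")
d.e <- dist(pc_scores, method = "euclidean")

# Agglomerative clustering with Ward's criterion on Euclidean distances.
hc <- hclust(d.e, method = "ward.D2")

# Cut the dendrogram into two groups (the number of groups is a choice).
groups <- cutree(hc, k = 2)
table(groups)
```

Calling `plot(hc)` draws the dendrogram, which is usually the easiest way to judge how many clusters are worth keeping.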